Proto-Indo-European Lexicon: The Generative Etymological Dictionary of Indo-European Languages

Author

  • Jouna Pyysalo
Abstract

Proto-Indo-European Lexicon (PIE Lexicon) is the generative etymological dictionary of the Indo-European languages. Its reconstruction of Proto-Indo-European (PIE) is obtained by applying the comparative method, the output of which equals the Indo-European (IE) data. Because of this, the Indo-European sound laws leading from PIE to IE, revised in Pyysalo 2013, can be coded as finite-state transducers (FSTs). For this purpose PIE Lexicon uses the foma finite-state compiler by Mans Hulden (2009). At present PIE Lexicon generates data for some 120 Indo-European languages with an accuracy rate of over 99% and is therefore the first dictionary in the world capable of generating (predicting) its data entries by means of digitized sound laws.
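
The idea of generating entries from digitized sound laws can be illustrated with a small sketch. This is not the PIE Lexicon implementation and does not use foma; it only mimics, in plain Python, how a cascade of rewrite rules maps a reconstructed form to a predicted one and how an accuracy rate could be computed against expected forms. The rule shown is a textbook simplification of the Grimm's-law chain shift, and the ASCII notation, the function names (`grimm`, `generate`, `accuracy`) and the test pairs are all assumptions made for illustration; the outputs are schematic shapes, not attested words.

```python
import re

# Toy sound law (assumption): the Grimm's-law chain shift on simplified ASCII input.
GRIMM = {
    "bh": "b", "dh": "d", "gh": "g",  # voiced aspirates -> plain voiced stops
    "p": "f", "t": "þ", "k": "h",     # voiceless stops  -> fricatives
    "b": "p", "d": "t", "g": "k",     # plain voiced stops -> voiceless stops
}
# Longest alternatives first, so "bh" is matched before "b".
GRIMM_PATTERN = re.compile("|".join(sorted(GRIMM, key=len, reverse=True)))


def grimm(form):
    """Apply the whole chain shift in one simultaneous pass over the form."""
    return GRIMM_PATTERN.sub(lambda m: GRIMM[m.group()], form)


def generate(form, rules):
    """Feed a form through an ordered cascade of rules (like composed transducers)."""
    for rule in rules:
        form = rule(form)
    return form


def accuracy(pairs, rules):
    """Share of (reconstruction, expected form) pairs the rules predict exactly."""
    hits = sum(generate(source, rules) == expected for source, expected in pairs)
    return hits / len(pairs)


if __name__ == "__main__":
    rules = [grimm]
    print(generate("treyes", rules))  # þreyes
    print(generate("dekm", rules))    # tehm
    # Hypothetical check data; the last pair is deliberately wrong.
    pairs = [("treyes", "þreyes"), ("dekm", "tehm"), ("pod", "fot"), ("ped", "fut")]
    print(accuracy(pairs, rules))     # 0.75
```

Applying the shift as one simultaneous substitution avoids the feeding problem of a naive sequence, where *bh > b followed by *b > p would wrongly turn *bh into p. A finite-state compiler handles such interactions through rule composition.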

Similar resources

Problems in Computerized Historical Linguistics: the Old Cornish Lexicon

This work represents an attempt to utilize the computer in solving problems in historical linguistics. The corpus upon which it operates is not a language but a recently published etymological dictionary of Old Cornish. Any observations regarding the scarcity or inaccuracy of the data utilized are, therefore, irrelevant, as far as the present paper is concerned. As the dictionary in question ...

2. IE & IEs

PROTO-INDO-EUROPEAN is the traditional name given to the ancestor language of the Indo-European family that is spread from Iceland to Chinese Turkestan and from Scandinavia to the Near East. A PROTO-LANGUAGE (Gk. prõtos ‘first’) refers to the earliest form of a language family presupposed by all of its descendants. There will forever be major gaps in the reconstruction of proto-languages, but a...

Dictionary Organization in Linguistic Automaton for Oriental Languages

The central problem for natural language processing (NLP) systems dealing with non-Indo-European (“Oriental”) languages is how to develop automatic dictionaries (AD) and dictionary entry (DE) schemes. The point is that the need of Oriental language industrial NLP has been felt for some time. It has acquired additional urgency with the rapid growth of business contacts between Russia and the nat...

Learning an English-chinese Lexicon from a Parallel Corpus

We report experiments on automatic learning of an English-Chinese translation lexicon, through statistical training on a large parallel corpus. The learned vocabulary size is nontrivial at 6,517 English words averaging 2.33 Chinese translations per entry, with a manually filtered precision of 95.1% and a single-most-probable precision of 91.2%. We then introduce a significance filtering method t...

Phonaesthemic and Etymological effects on the Distribution of Senses in Statistical Models of Semantics

This paper uses methods based on corpus statistics and synonymy to explore the role language history and sound/form relationships play in conceptual organization through a case study relating the phonaestheme gl- to its prevalent Proto-Indo-European root, *ghel. The results of both methods point to a strong link between the phonaestheme and the historical root, suggesting that the lineage of a la...

Publication date: 2017